Abstract 

This contribution to the debate on confidence limits focuses mostly on the 
case of measurements with 'open likelihood', in the sense that it is defined in 
the text. I will show that, though a prior-free assessment of confidence is, in 
general, not possible, still a search result can be reported in a mostly unbiased 
and efficient way, which satisfies some desiderata which I believe are shared by 
the people interested in the subject. The simpler case of 'closed likelihood' will 
also be treated, and I will discuss why a uniform prior on a sensible quantity is 
a very reasonable choice for most applications. In both cases, I think that much 
clarity will be achieved if we remove from scientific parlance the misleading 
expressions 'confidence intervals' and 'confidence levels'. 



"You see, a question has arisen, 
about which we cannot come to an agreement, 
probably because we have read too many books" 

(Brecht's Galileo)^1 



1 INTRODUCTION 

The blooming of papers on 'limits' in the past couple of years [1-11] and a workshop [12] entirely 
dedicated to the subject are striking indicators of the level of the problem. It is difficult not to agree that 
at the root of the problem is the standard physicist's education in statistics, based on the collection of 
frequentistic prescriptions, given the lofty name of 'classical statistical theory' by their supporters and 
called 'frequentistic adhoc-eries'^2 by their opponents. In fact, while in routine measurements characterised by 
a narrow likelihood 'correct numbers' are obtained by frequentistic prescriptions (though the intuitive 
interpretation that physicists attribute to them is that of probabilistic statements^3 about true values [15]), 
they fail in "difficult cases: small or unobserved signal, background larger than signal, background not 
well known, and measurements near a physical boundary" [12]. 

1 "Sehen Sie, es ist eine Frage entstanden, über die wir uns nicht einig werden können, wahrscheinlich, weil wir zu viele 
Bücher gelesen haben." (Bertolt Brecht, Leben des Galilei). 

2 For example, even Sir Ronald Fisher used to refer to Neyman's statistical confidence method as "that technological and 
commercial apparatus which is known as an acceptance procedure" [13]. In my opinion, the term 'classical' is misleading, as 
are the results of these methods. The name gives the impression of being analogous to 'classical physics', which was developed 
by our 'classicals', and that still holds for ordinary problems. Instead, the classicals of probability theory, like Laplace, Gauss, 
Bayes, Bernoulli and Poisson, had an approach to the problem more similar to what we would call nowadays 'Bayesian' (for 
a historical account see Ref. [14]). 

3 It is a matter of fact [15] that confidence levels are intuitively thought of (and usually taught) by the large majority of 
physicists as degrees of belief on true values, although the expression 'degree of belief' is avoided, because "beliefs are not 
scientific". Even books which do insist on stating that probability statements are not referred to true values ("true values are 
constants of unknown value") have a hard time explaining the real meaning of the result, i.e. something which maps into the 
human mind's perception of uncertain events. So, they are forced to use ambiguous sentences which remain stamped in the 
memory of the reader much more than the frequentistically-correct twisted reasoning that they try to explain. For example a 
classical particle physics statistics book [16] speaks about "the faith we attach to this statement", as if 'faith' was not the same 
as degree of belief. Another one [17] introduces the argument by saying that "we want to find the range . . . which contains the 
true value θ with probability β", though rational people are at a loss in trying to convince themselves that the proposition "the 
range contains θ with probability β" does not imply "θ is in that range with probability β". 

It is interesting to note that many of the above-cited papers on limits have been written in the wake 
of an article [||] which was promptly adopted by the PDG [Q] as the longed-for ultimate solution to the 
problem, which could finally "remove an original motivation for the description of Bayesian intervals by 
the PDG". However, although that article, thanks to the authority of the PDG, has been widely used by 
many experimental teams to publish limits, even by people who did not understand the method or were 
sceptical about it,^4 it has triggered a debate between those who simply object to the approach 
(e.g. Ref. [||]), those who propose other prescriptions (many of these authors do it with the explicit 
purpose of "avoiding Bayesian contaminations" [11] or of "giving a strong contribution to rid physics of 
Bayesian intrusions"^5), and those who just propose to change radically the path. 

The present contribution to the debate, based on Refs. [0, || [l(| 15, 19, 2^], is in the framework 
of what has been initially the physicists' approach to probability,^6 and which I maintain [15] is still the 
intuitive reasoning of the large majority of physicists, despite the 'frequentistic intrusion' in the form 
of standard statistical courses in the physics curriculum. I will show by examples that an aseptic prior- 
free assessment of 'confidence' is a contradiction in terms and, consequently, that the solution to the 
problem of assessing 'objective' confidence limits does not exist. Finally, I will show how it is possible, 
nevertheless, to present search results in an objective (in the sense this committing word is commonly 
perceived) and optimal way which satisfies the desiderata expressed in Section 2. The price to 
pay is to remove the expression 'confidence limit' from our parlance and talk, instead, of 'sensitivity 
bound' to mean a prior-free result. The expression 'probabilistic bound' should be used, in turn, to assess 
how much we are really confident, i.e. how much we believe, that the quantity of interest is above or 
below the bound, under clearly stated prior assumptions. 

The present paper focuses mostly on the 'difficult cases' [12], which will be classified as 'frontier 
measurements' [22], characterized by an 'open likelihood', as will be better specified in Section [7], where 
this situation will be compared to the easier case of 'closed likelihood'. It will be shown why there are 
good reasons to present routinely the experimental outcome in two different ways for the two cases. 



2 DESIDERATA FOR AN OPTIMAL PRESENTATION OF SEARCH RESULTS 

Let us specify an optimal presentation of a search result in terms of some desired properties. 

• The way of reporting the result should not depend on whether the experimental team is more or 
less convinced to have found the signal looked for. 

• The report should allow an easy, consistent and efficient combination of all pieces of information 
which could come from several experiments, search channels and running periods. By efficient 
I mean the following: if many independent data sets each provide a little evidence in favour of 
the searched-for signal, the combination of all data should enhance that hypothesis; if, instead, 
the indications provided by the different data are incoherent, their combination should result in 
stronger constraints on the intensity of the postulated process (a higher mass, a lower coupling, 
etc.). 

• Even results coming from low sensitivity (and/or very noisy) data sets could be included in the 
combination, without them spoiling the quality of the result obtainable by the clean and high-sensitivity 
data sets alone. If the poor-quality data carry the slightest piece of evidence, this information 
should play the correct role of slightly increasing the global evidence. 

• The presentation of the result (and its meaning) should not depend on the particular application 
(Higgs search, scale of contact-interaction, proton decay, etc.). 

• The result should be stated in such a way that it cannot be misleading. This requires that it should 
easily map into the natural categories developed by the human mind for uncertain events. 

• Uncertainties due to systematic effects of uncertain size should be included in a consistent and (at 
least conceptually) simple way. 

• Subjective contributions of the persons who provide the results should be kept at a minimum. 
These contributions cannot vanish, in the sense that we have always to rely on the "understanding, 
critical analysis and integrity" [23] of the experimenters, but at least the dependence on the believed 
values of the quantity should be minimal. 

• The result should summarize completely the experiment, and no extra pieces of information 
(luminosity, cross-sections, efficiencies, expected number of background events, observed number of 
events) should be required for further analyses.^7 

• The result should be ready to be turned into probabilistic statements, needed to form one's opinion 
about the quantity of interest or to take decisions. 

• The result should not lead to paradoxical conclusions. 

4 This non-scientific practice has been well expressed by a colleague: "At least we have a rule, no matter if good or bad, 
to which we can adhere. Some of the limits have changed? You know, it is like when governments change the rules of social 
games: some win, some lose." When people ask me why I disagree with that article, I just encourage them to read the paper 
carefully, instead of simply picking a number from a table. 

5 See Ref. [ p^ ] to get an idea of the present 'Bayesian intrusion' in the sciences, especially in those disciplines in which 
frequentistic methods arose. 

6 Insightful historical remarks about the correlation physicists-'Bayesians' (in the modern sense) can be found in the first 
two sections of Chapter 10 of Jaynes' book [ Kip. For a more extensive account of the original approach of Laplace, Gauss and 
other physicists and mathematicians, see Ref. [14]. 

3 ASSESSING THE DEGREE OF CONFIDENCE 



As Barlow says [24], "Most statistics courses gloss over the definition of what is meant by probability, 
with at best a short mumble to the effect that there is no universal agreement. The implication is that such 
details are irrelevancies of concern only to long-haired philosophers, and need not trouble us hard-headed 
scientists. This is short-sighted; uncertainty about what we really mean when we calculate probabilities 
leads to confusion and bodging, particularly on the subject of confidence levels. . . . Sloppy thinking 
and confused arguments in this area arise mainly from changing one's definition of 'probability' in 
midstream, or, indeed, of not defining it clearly at all." Ask your colleagues how they perceive the 
statement "95% confidence level lower bound of 77.5 GeV/c^2 is obtained for the mass of the Standard 
Model Higgs boson" [^]. I conducted an extensive poll in July 1998, personally and by electronic mail. 
The result [15] is that for the large majority of people the above statement means that "assuming the 
Higgs boson exists, we are 95% confident that the Higgs mass is above that limit, i.e. the Higgs boson 
has 95% chance (or probability) of being on the upper side, and 5% chance of being on the lower side",^8 
which is not what the operational definition of that limit implies [9]. Therefore, following the suggestion 
of Barlow [24], let us "take a look at what we mean by the term 'probability' (and confidence) before 
discussing the serious business of confidence levels." I will do this with some examples, referring to 
Refs. [19, 20] for more extensive discussions and further examples. 



7 For example, during the work for Ref. [Jsp , we were unable to use only the 'results', and had to restart the analysis from the 
detailed pieces of information, which are not always as detailed as one would need. For this reason we were quite embarrassed 
when, finally, we were unable to use consistently the information published by one of the four LEP experiments. 

8 Actually, there were those who refused to answer the question because "it is going to be difficult to answer", and those 
who insisted on repeating the frequentistic lesson on lower limits, but without being able to provide a convincing statement 
understandable to a scientific journalist or to a government authority - these were the terms of the question - about the degree 
of confidence that the Higgs is heavier than the stated limit. I would like to report the latest reply to the poll, which arrived just 
the day before this workshop: "I apologize I never got around to answering your mail, which I suppose you can rightly regard 
as evidence that the classical procedures are not trivial!" 
Fig. 1: A box has with certainty one of these six black and white ball compositions. The content of the box is 
inferred by extracting at random a ball from the box then returning it to the box. How confident are you initially of 
each composition? How does your confidence change after the observation of 1, 5 and 8 consecutive extractions 
of a black ball? 

3.1 Variations over a problem to Newton 

It seems^9 that Isaac Newton was asked to solve the following problem. A man condemned to death has 
an opportunity of having his life saved and to be freed, depending on the outcome of an uncertain event. 
The man can choose between three options: a) roll 6 dice, and be freed if he gets '6' with one and only 
one die (A); b) roll 12 dice, and be freed if he gets '6' with exactly 2 dice (B); c) roll 18 dice, and be freed 
if he gets '6' with exactly 3 dice (C). Clearly, he will choose the event about which he is more confident (we 
could also say the event which he considers more probable; the event most likely to happen; the event 
which he believes mostly; and so on). Most likely the condemned man is not able to solve the problem, 
but he certainly will understand Newton's suggestion to choose A, which gives him the highest chance 
to survive. He will also understand the statement that A is about six times more likely than B and thirty 
times more likely than C. The condemned would perhaps ask Newton to give him some idea how likely 
the event A is. A good answer would be to make a comparison with a box containing 1000 balls, 94 of 
which are white. He should be as confident of surviving as of extracting a white ball from the box,^10 i.e. 
9.4% confident of being freed and 90.6% confident of dying: not really an enviable situation, but better 
than choosing C, corresponding to only 3 white balls in the box. 

Coming back to the Higgs limit, are we really honestly 95% confident that the value of its mass 
is above the limit as we are confident that a neutralino mass is above its 95% C.L. limit, as a given 
branching ratio is below its 95% C.L. limit, etc., as we are confident of extracting a white ball from a 
box which contains 95 white and 5 black balls? 

Let us imagine now a more complicated situation, in which you have to make the choice (imagine 
for a moment you are the prisoner, just to be emotionally more involved in this academic exercise^11). A 
box contains with certainty 5 balls, with a white ball content ranging from 0 to 5, the remaining balls 
being black (see Fig. 1, and Ref. [^] for further variations on the problem). One ball is extracted at 
random, shown to you, and then returned to the box. The ball is black. You get freed if you guess 
correctly the composition of the box. Moreover you are allowed to ask a question, to which the judges 
will reply correctly if the question is pertinent and such that their answer does not indicate with certainty 
the exact content of the box. 

Having observed a black ball, the only certainty is that H_5 is ruled out. As far as the other five 
possibilities are concerned, a first idea would be to be more confident about the box composition which 
has more black balls (H_0), since this composition gives the highest chance of extracting this colour. 
Following this reasoning, the confidence in the various box compositions would be proportional to their 
black ball content. But it is not difficult to understand that this solution is obtained by assuming that the 
compositions are considered a priori equally possible. However, this condition was not stated explicitly 
in the formulation of the problem. How was the box prepared? You might think of an initial situation 
of six boxes each having a different composition. But you might also think that the balls were picked 
at random from a large bag containing a roughly equal proportion of white and black balls. Clearly, the 
initial situation changes. In the second case the composition H_0 is initially so unlikely that, even after 
having extracted a black ball, it remains not very credible. As eloquently said by Poincaré [27], "an 
effect may be produced by the cause a or by the cause b. The effect has just been observed. We ask 
the probability that it is due to the cause a. This is an a posteriori probability of cause. But I could not 
calculate it, if a convention more or less justified did not tell me in advance what is the a priori probability 
for the cause a to come into play. I mean the probability of this event to some one who had not observed 
the effect." The observation alone is not enough to state how much one is confident about something. 

9 My source of information is Ref. [25]. It seems that Newton gave the 'correct answer' - indeed, in this stereotyped problem 
there is the correct answer. 

10 The reason why any person is able to claim to be more confident of extracting a white ball from the box that contains the 
largest fraction of white balls, while for the evaluation of the above events one has to 'ask Newton', does not imply a different 
perception of the 'probability' in the two classes of events. It is only because the events A, B and C are complex events, the 
probability of which is evaluated from the probability of the elementary events (and everybody can figure out what it means 
that the six faces of a die are equally likely) plus some combinatorics, for which some mathematical education is needed. 

11 Bruno de Finetti used to say that either probability concerns real events in which we are interested, or it is nothing 

The proper way to evaluate the level of confidence, which takes into account (with the correct 
weighting) experimental evidence and prior knowledge, is recognized to be Bayes' theorem:^12 

$$ P(H_i \mid E) \propto P(E \mid H_i) \cdot P(H_i) , \tag{1} $$

where E is the observed event (black or white), P(H_i) is the initial (or a priori) probability of H_i (often 
called simply 'prior'), P(H_i | E) is the final (or 'posterior') probability, and P(E | H_i) is the 'likelihood'. 
The upper plot of Fig. 2 shows the likelihood P(Black | H_i) of observing a black ball assuming each 
possible composition. The second pair of plots shows the two priors considered in our problem. The 
final probabilities are shown next. We see that the two solutions are quite different, as a consequence of 
different priors. So a good question to ask the judges would be how the box was prepared. If they say it 
was uniform, bet your life on H_0. If they say the five balls were extracted from a large bag, bet on H_2. 

Perhaps the judges might be so clement as to repeat the extraction (and subsequent reintroduction) 
several times. Figure 2 shows what happens if five or eight consecutive black balls are observed. The 
evaluation is performed by iterating Eq. (1): 

$$ P_n(H_i \mid E) \propto P(E_n \mid H_i) \cdot P_{n-1}(H_i) . \tag{2} $$

If you are convinced^13 that the preparation procedure is the binomial one (large bag), you still consider 
H_1 more likely than H_0, even after five consecutive observations. Only after eight consecutive extractions 
of a black ball are you mostly confident about H_0 independently of how much you believe in the two 
preparation procedures (but, obviously, you might imagine - and perhaps even believe in - more fancy 
preparation procedures which still give different results). After many extractions we are practically sure 
of the box content, as we shall see in a while, though we can never be certain. 
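
The numbers behind the box example are easy to reproduce. What follows is only a minimal sketch, not the author's code: it assumes that H_i denotes the composition with i white balls, so that P(Black | H_i) = (5 - i)/5, and that the 'large bag' preparation corresponds to a binomial prior with equal white and black proportions. All function names are illustrative.

```python
from math import comb

N_BALLS = 5
HYPOTHESES = range(N_BALLS + 1)   # H_i: i white balls and 5 - i black balls


def likelihood_black(i):
    """P(Black | H_i) for a box with i white balls out of five."""
    return (N_BALLS - i) / N_BALLS


def normalize(weights):
    """Rescale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def posterior(prior, n_black):
    """Apply the update prior -> posterior once per observed black ball."""
    p = list(prior)
    for _ in range(n_black):
        p = normalize([likelihood_black(i) * p[i] for i in HYPOTHESES])
    return p


def most_probable(p):
    """Index i of the composition H_i with the largest probability."""
    return max(HYPOTHESES, key=lambda i: p[i])


# Assumed priors: six equally likely boxes, or five balls drawn from a
# large bag with equal proportions of white and black balls.
uniform_prior = normalize([1.0] * (N_BALLS + 1))
binomial_prior = normalize([comb(N_BALLS, i) for i in HYPOTHESES])

for n in (1, 5, 8):
    print(n,
          most_probable(posterior(uniform_prior, n)),
          most_probable(posterior(binomial_prior, n)))
```

Under these assumptions the uniform prior favours H_0 at every step, while the binomial prior favours H_2 after one black ball, H_1 after five, and H_0 only after eight.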

Coming back to the limits, imagine now an experiment operated for a very short time at LEP200 
and reporting no four-jet events, no deuterons, no zirconium and no Higgs candidates (and you might add 
something even more fancy, like events with 100 equally energetic photons, or some organic molecule). 
How could the 95% upper limit to the rate of these events be the same? What does it mean that the 95% 
upper limit calculated automatically should give us the same confidence for all rates, independently of 
what the events are? 



12 See Ref. for a derivation of Bayes' theorem based on the box problem we are dealing with. 

13 And if you have doubts about the preparation? The probability rules teach us what to do. Calling U (uniform) and 
B (binomial) the two preparation procedures, with probabilities P(U) and P(B), we have 
P(H | obs) = P(H | obs, U) · P(U) + P(H | obs, B) · P(B). 

Fig. 2: Confidence in the box contents as a function of prior and observation (see text). 

3.2 Confidence versus evidence 

The fact that the same (in a crude statistical sense) observation does not lead to the same assessment of 
confidence is rather well understood by physicists: a few pairs of photons clustering in invariant mass 
around 135 MeV have a high chance of coming from a π^0; more events clustering below 100 MeV are 
certainly background (let us consider a well calibrated detector); a peak in invariant mass in a new energy 
domain might be seen as a hint of new physics, and distinguished theorists consider it worth serious 
speculation. The difference between the three cases is the prior knowledge (or scientific prejudice). Very 
often we share more or less the same prejudices, and consequently we will all agree on the conclusions. 
But this situation is rare in frontier science, and the same observation does not produce in all researchers 
the same confidence. A peak can be taken more or less seriously depending on whether it is expected, 
it fits well in the overall theoretical picture, and does not contradict other observations. Therefore it is 
important to try to separate experimental evidence from the assessments of confidence. This separation 
is done in a clear and unambiguous way in the Bayesian approach. Let us illustrate it by continuing with 
the box example. Take again Eq. (1). Considering any two hypotheses H_i and H_j, we have the following 
relation between prior and posterior betting odds: 

$$ \frac{P(H_i \mid E)}{P(H_j \mid E)} = \underbrace{\frac{P(E \mid H_i)}{P(E \mid H_j)}}_{\text{Bayes factor}} \cdot \frac{P(H_i)}{P(H_j)} . \tag{3} $$

This way of rewriting Bayes' theorem shows how the final odds can be factorized into prior odds 
and experimental evidence, the latter expressed in terms of the so-called Bayes factor [^]. The 15 odds 
of our example are not independent, and can be expressed with respect to a reference box composition 
which has a non-null likelihood. The natural choice to analyse the problem of consecutive black ball 
extractions is 

$$ \mathcal{R}(H_i; \mathrm{Black}) = \frac{P(\mathrm{Black} \mid H_i)}{P(\mathrm{Black} \mid H_0)} , \tag{4} $$

which is, in this particular case, numerically identical to P(Black | H_i), since P(Black | H_0) = 1, and 
then it can be read from the top plot of Fig. 2. The function R can be seen as a 'relative belief updating 
ratio' [10], in the sense that it tells us how the beliefs must be changed after the observation, though it 
cannot determine univocally their values. Note that the way the update is done is, instead, univocal and 
not subjective, in the sense that Bayes' theorem is based on logic, and rational people cannot disagree. It 
is also obvious what happens when many consecutive black balls are observed. The iterative application 
of Bayes' theorem [Eq. (2)] leads to the following overall R: 

$$ \mathcal{R}(H_i; \mathrm{Black}, n) = \left[ \frac{P(\mathrm{Black} \mid H_i)}{P(\mathrm{Black} \mid H_0)} \right]^{n} . \tag{5} $$

For large n all the odds with respect to H_0 go to zero, i.e. P(H_0) → 1. 
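
The factorization of the final odds into evidence and prior odds can be illustrated numerically. This is a hedged sketch under the same assumed labelling as in the box example (H_i has i white balls, so P(Black | H_i) = (5 - i)/5); the function names are hypothetical and the prior odds passed in are whatever the reader believes.

```python
def likelihood_black(i, n_balls=5):
    """P(Black | H_i) for a box with i white balls out of n_balls."""
    return (n_balls - i) / n_balls


def belief_updating_ratio(i, n):
    """R(H_i; Black, n): evidence for H_i relative to H_0 after n black balls."""
    return (likelihood_black(i) / likelihood_black(0)) ** n


def posterior_odds(i, n, prior_odds):
    """Final odds of H_i versus H_0: experimental evidence times prior odds."""
    return belief_updating_ratio(i, n) * prior_odds


# The evidence part is the same for everybody ...
for n in (1, 5, 8, 50):
    print(n, [round(belief_updating_ratio(i, n), 4) for i in range(6)])

# ... while the final odds depend on the prior odds one starts from.
# Example: prior odds 5:1 in favour of H_1 over H_0 (binomial preparation).
print(posterior_odds(1, 5, 5.0), posterior_odds(1, 8, 5.0))
```

With prior odds of 5 to 1 for H_1 against H_0, the final odds stay above 1 after five black balls and drop below 1 after eight, while the ratio R itself does not depend on any prior.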

We have now our logical and mathematical apparatus ready. But before moving to the problem 
of interest, let us make some remarks on terminology, on the meaning of subjective probability, and on its 
interplay with odds in betting and expected frequencies. 



3.3 Confidence, betting odds and expected frequencies 

I have used on purpose several words and expressions to mean essentially the same thing: likely, proba- 
ble, credible, (more or less) possible, plausible, believable, and their associated nouns; to be more or less 
confident about, to believe more or less, to trust more or less, something, and their associated nouns; to 
prefer to bet on an outcome rather than another one, to assess betting odds, and so on. I could also use 
expressions involving expected frequencies of outcomes of apparently similar situations. The perception 
of probability would remain the same, and there would be no ambiguities or paradoxical conclusions. I 
refer to Ref. [ 5fj| ] for a more extended, though still concise, discussion on the terms. I would like only to 
sketch here some of the main points, as a summary of the previous sections. 

• The so-called subjective probability is based on the acknowledgement that the concept of probability 
is primitive, i.e. it is meant as the degree of belief developed by the human mind in a condition 
of uncertainty, no matter what we call it (confidence, belief, probability, etc.) or how we evaluated it 
(symmetry arguments, past frequencies, Bayes' theorem, quantum mechanics formulae [29], etc.). 
Some argue that the use of beliefs is not scientific. I believe, on the other hand, that "it is scientific 
only to say what is more likely and what is less likely" [30]. 

• The odds in a 'coherent bet' (a bet such that the person who assesses its odds has no preference in 
either direction) can be seen as the normative rule to force people to assess honestly their degrees 
of belief 'in the most objective way' (as this expression is usually perceived). This is the way that 
Laplace used to report his result about the mass of Saturn: "it is a bet of 10,000 to 1 that the error 
of this result is not 1/100th of its value" (quote reported in Ref. [31]). 

• Probability statements have to satisfy the basic rules of probability, usually known as axioms. 
Indeed, the basic rules can be derived, as theorems, from the operative definition of probability 
through a coherent bet. The probability rules, based on the axioms and on logic's rules, allow the 
probability assessments to be propagated to logically connected events. For example, if one claims 
to be xx% confident about E, one should feel also (100 - xx)% confident about the opposite event Ē. 

• The simple, stereotyped cases of regular dice and urns of known composition can be considered 
as calibration tools to assess the probability, in the sense that all rational people will agree. 

• The probability rules, and in particular Bernoulli's theorem, relate degrees of belief to expected 
frequencies, if we imagine repeating the experiment many times under exactly the same conditions 
of uncertainty (not necessarily under the same physical conditions). 

• Finally, Bayes' theorem is the logical tool to update the beliefs in the light of new information. 

As an example, let us imagine the event E, which is considered 95% probable (and, necessarily, the 
opposite event Ē is 5% probable). This belief can be expressed in many different ways, all containing 
the same degree of uncertainty: 

• I am 95% confident about E and 5% confident about Ē. 

• Given a box containing 95 white and 5 black balls, I am as confident that E will happen, as that 
the colour of the ball will be white. I am as confident about Ē as of extracting a black ball. 

• I am ready to place a 19:1 bet^14 on E, or a 1:19 bet on Ē. 

• Considering a large number n of events E_i, even related to different phenomenology and each 
having 95% probability, I am highly confident^15 that the relative frequency of the events which 
will happen will be very close to 95% (the exact assessment of my confidence can be evaluated 
using the binomial distribution). If n is very large, I am practically sure that the relative frequency 
will be equal to 95%, but I am never certain, unless n is 'infinite', but this is no longer a real 
problem, in the sense of the comment in footnote 11 ("In the long run we are all dead" [32]). 
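
The last two statements can be made quantitative. The sketch below is illustrative only: it assumes that the n events are judged independent, each with the same probability p, so that the number of events which happen follows a binomial distribution; the tolerance of 0.02 around 95% and the function names are arbitrary choices made here, not taken from the text.

```python
from math import exp, lgamma, log


def binomial_pmf(k, n, p):
    """P(k events out of n happen), computed via logarithms to avoid overflow."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1.0 - p))
    return exp(log_pmf)


def prob_frequency_within(n, p, tol):
    """Confidence that the relative frequency k/n lies within tol of p."""
    return sum(binomial_pmf(k, n, p)
               for k in range(n + 1)
               if abs(k / n - p) <= tol + 1e-12)   # small guard for rounding


def betting_odds(p):
    """Odds (to 1) of a coherent bet on an event of probability p."""
    return p / (1.0 - p)


print(betting_odds(0.95))            # close to 19, i.e. a 19:1 bet
for n in (100, 1000, 10000):
    print(n, prob_frequency_within(n, 0.95, 0.02))
```

The confidence that the relative frequency falls near 95% grows with n and approaches, without ever reaching, certainty.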

Is this how our confidence limits from particle searches are perceived? Are we really 5% confident that 
the quantity of interest is on the 5% side of the limit? Isn't it strange that out of the several thousand 
limits from searches published in recent decades nothing has ever shown up on the 5% side? In my 
opinion, the most embarrassing situation comes from the Higgs boson sector. A 95% C.L. upper limit is 
obtained from radiative corrections, while a 95% C.L. lower limit comes from direct search. Both results are 
presented with the same expressions, only 'upper' being replaced by 'lower'. But their interpretation is 
completely different. In the first case it is easy to show [34] that, using the almost parabolic result of 
the χ² fit in ln(M_H) and uniform prior in ln(M_H), we can really talk about '95% confidence that the 
mass is below the limit', or that 'the Higgs mass has equal chance of being on either side of the value 
of minimum χ²', and so on, in the sense described in this section. This is not true in the second case. 
Who is really 5% confident that the mass is below the limit? How can we be 95% confident that the 
mass is above the limit without an upper bound? Non-misleading levels of confidence on the statement 
M_H > M can be assessed only by using the information coming from precision measurement, which 
rules out very large (and also very small) values of the Higgs mass (see Refs. [|, 33, 34]). For example, 
when we say [[P|] that the median of the Higgs mass p.d.f. is 150 GeV, we mean that, to the best of our 
knowledge, we regard the two events M_H < 150 and M_H > 150 as equally likely, like the two faces 
of a regular coin. Following Laplace, we could state our confidence claiming that 'it is a bet of 1 to 1 that 
M_H is below 150 GeV'. 

14 See Ref. [ ^5| for comments on decision problems involving subjectively-relevant amounts of money. 

15 It is in my opinion very important to understand the distinction between the use of this frequency-based expression of 
probability and frequentistic approach (see comments in Refs. [ pc| ] and |p^|) or frequentistic coverage (see Section 8.6 of 
Ref. |p^|]). I am pretty sure that most physicists who declare themselves frequentists do so on the basis of educational conditioning and 
because they are accustomed to assessing beliefs (scientific opinion, or whatever) in terms of expected frequencies. The crucial 
point which makes the distinction is to ask oneself if it is sensible to speak about probability of true values, probability of 
theories, and so on. There is also a class of sophisticated people who think there are several probabilities. For comments on 
this latter attitude, see Section 8.1 of Ref. [H. 



4 INFERRING THE INTENSITY OF POISSON PROCESSES AT THE LIMIT OF THE DETECTOR 
SENSITIVITY AND IN THE PRESENCE OF BACKGROUND 

As a master example of frontier measurement, let us take the same case study as in Ref. [10]. We shall 
focus then on the inference of the rate of gravitational wave (g.w.) bursts measured by coincidence 
analysis of g.w. antennae. 



4.1 Modelling the inferential process 

Moving from the box example to the more interesting physics case of g.w. bursts is quite straightforward. 
The six hypotheses H_i, playing the role of causes, are now replaced by the infinite values of the rate r. 
The two possible outcomes black and white now become the number of candidate events (n_c). There is 
also an extra ingredient which comes into play: a candidate event could come from background rather 
than from g.w.'s (like a black ball that could be extracted by a judge-conjurer from his pocket rather than 
from the box...). Clearly, if we understand the experimental apparatus well, we must have some idea of 
the background rate r_b. Otherwise, it is better to study the performance of the detector further before 
trying to infer anything. Anyhow, unavoidable residual uncertainty on r_b can be handled consistently 
(see later). Let us summarize our ingredients in terms of Bayesian inference. 

• The physical quantity of interest, and with respect to which we are in the state of greatest 
uncertainty, is the g.w. burst rate r. 

• We are rather sure about the expected rate of background events r_b (but not about the number of 
events due to background which will actually be observed). 

• What is certain^16 is the number n_c of coincidences which have been observed. 

• For a given hypothesis r the number of coincidence events which can be observed in the 
observation time T is described by a Poisson process having an intensity which is the sum of that due to 
background and that due to signal. Therefore the likelihood is 

    P(n_c | r, r_b) = f(n_c | r, r_b) = \frac{e^{-(r + r_b) T} \, [(r + r_b) T]^{n_c}}{n_c!} .    (6) 

Bayes' theorem applied to probability functions and probability density functions (we use the same 
symbol for both), written in terms of the uncertain quantities of interest, is 

    f(r | n_c, r_b) \propto f(n_c | r, r_b) \cdot f_\circ(r) .    (7) 

At this point, it is now clear that if we want to assess our confidence we need to choose some prior. We 
shall come back to this point later. Let us see first, following the box problem, how it is possible to make 
a prior-free presentation of the result. 

16 Obviously the problem can be complicated at will, considering for example that n_c was communicated to us in a way, or 
by somebody, which/who is not 100% reliable. A probabilistic theory can include this possibility, but this goes beyond the 
purpose of this paper. See e.g. Ref. [35] for further information on probabilistic networks. 






4.2 Prior-free presentation of the experimental evidence 

Also in the continuous case we can factorize the prior odds and experimental evidence, and then arrive 
at an ℛ-function similar to Eq. (||): 

    \mathcal{R}(r; n_c, r_b) = \frac{f(n_c | r, r_b)}{f(n_c | r = 0, r_b)} .    (8) 

The function ℛ has nice intuitive interpretations which can be highlighted by rewriting the ℛ-function 
in the following way [see Eq. (7)]: 

    \mathcal{R}(r; n_c, r_b) = \frac{f(n_c | r, r_b)}{f(n_c | r = 0, r_b)} = \frac{f(r | n_c, r_b)}{f_\circ(r)} \bigg/ \frac{f(r = 0 | n_c, r_b)}{f_\circ(r = 0)} .    (9) 

ℛ has the probabilistic interpretation of 'relative belief updating ratio', or the geometrical interpretation 
of 'shape distortion function' of the probability density function. ℛ goes to 1 for r → 0, i.e. in the 
asymptotic region in which the experimental sensitivity is lost. As long as it is 1, the shape of the p.d.f. 
(and therefore the relative probabilities in that region) remains unchanged. In contrast, in the limit ℛ → 0 
(for large r) the final p.d.f. vanishes, i.e. the beliefs go to zero no matter how strong they were before. 
For the Poisson process we are considering, the ℛ-function becomes 

    \mathcal{R}(r; n_c, r_b, T) = e^{-rT} \left(1 + \frac{r}{r_b}\right)^{n_c} ,    (10) 

with the condition r_b > 0 if n_c > 0. The case r_b = n_c = 0 yields ℛ(r) = e^{-rT}, obtainable starting 
directly from Eq. (8) and Eq. (6). Also the case r_b → ∞ has to be evaluated directly from the definition 
of ℛ and from the likelihood, yielding ℛ = 1 for all r. Finally, the case r_b = 0 and n_c > 0 makes r = 0 
impossible, thus making the likelihood closed also on the left side (see Section 7). In this case the 
discovery is certain, though the exact value of r can still be rather uncertain. Note, finally, that if n_c = 0 
the ℛ-function does not depend on r_b, which might seem a bit surprising at first sight (I confess that 
I was puzzled for years by this result, which was formally correct though not intuitively obvious. 
Pia Astone has finally shown at this workshop that things must logically go this way [^].) 
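
For readers who want the intermediate step, the closed form of Eq. (10) can be recovered by inserting the Poisson likelihood of Eq. (6) into the ratio of Eq. (8). The lines below are only a sketch of that algebra, using no symbols beyond those already defined in the text (r_b > 0 is assumed in the first line).

```latex
\begin{align*}
\mathcal{R}(r; n_c, r_b, T)
  &= \frac{e^{-(r+r_b)T}\,[(r+r_b)T]^{n_c}/n_c!}
          {e^{-r_b T}\,(r_b T)^{n_c}/n_c!}
   = e^{-rT}\left(\frac{r+r_b}{r_b}\right)^{n_c}
   = e^{-rT}\left(1+\frac{r}{r_b}\right)^{n_c} , \\[4pt]
n_c = 0 :\quad
  \mathcal{R} &= e^{-rT}
  \quad\text{(the background rate cancels in the ratio)} , \\[4pt]
\frac{\mathrm{d}\ln\mathcal{R}}{\mathrm{d}r}
  &= -T + \frac{n_c}{r+r_b} = 0
  \;\Longrightarrow\;
  r_{\mathrm{peak}} = \frac{n_c}{T} - r_b
  \quad (\text{a peak at } r > 0 \text{ exists only if } n_c > r_b T) .
\end{align*}
```

The second line makes explicit why the result for n_c = 0 does not depend on r_b.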

A numerical example will illustrate the nice features of the ℛ-function. Consider T as unit time 
(e.g. one month), a background rate r_b such that r_b × T = 1, and the following hypothetical observations: 
n_c = 0; n_c = 1; n_c = 5. The resulting ℛ-functions are shown in Fig. 3. The abscissa has been drawn in 
a log scale to make it clear that several orders of magnitude are involved. These curves transmit the result 
of the experiment immediately and intuitively. Whatever one's beliefs on r were before the data, these 
curves show how one must change them. The beliefs one had for rates far above 20 events/month are 
killed by the experimental result. If one believed strongly that the rate had to be below 0.1 events/month, 
the data are irrelevant. The case in which no candidate events have been observed gives the strongest 
constraint on the rate. The case of five candidate events over an expected background of one produces a 
peak of ℛ which corroborates the beliefs around 4 events/month only if there were sizable prior beliefs 
in that region (the question of whether g.w. bursts exist at all is discussed in Ref. [10]). 
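
The numerical example above can be reproduced with a few lines of code. The following is a minimal sketch, not the code used for the figure: the function name `r_function` and the grid of rates are illustrative choices, and Eq. (10) is assumed with r_b > 0.

```python
import math


def r_function(r, n_c, r_b, T=1.0):
    """Eq. (10): exp(-r*T) * (1 + r/r_b)**n_c, valid for r_b > 0."""
    return math.exp(-r * T) * (1.0 + r / r_b) ** n_c


# Setting of the numerical example: T = 1 month and r_b * T = 1.
# The grid of rates is an illustrative choice spanning several orders of magnitude.
rates = [0.01, 0.1, 1.0, 4.0, 20.0, 100.0]  # events/month
table = {n_c: [r_function(r, n_c, r_b=1.0) for r in rates] for n_c in (0, 1, 5)}

for n_c, values in table.items():
    print(n_c, " ".join(f"{v:10.3e}" for v in values))

# Setting d(ln R)/dr = -T + n_c/(r + r_b) = 0 gives the position of the peak,
# r_peak = n_c/T - r_b, which exists only when n_c > r_b * T.
r_peak = 5 / 1.0 - 1.0  # 4 events/month for n_c = 5
```

The printed rows show the behaviour described in the text: values close to 1 at very small rates, and a rapid fall towards 0 for rates above about 20 events/month.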

Moreover there are some computational advantages in reporting the ℛ-function as a result of a 
search experiment. The comparison between different results given by the ℛ-function can be perceived 
better than if these results were presented in terms of absolute likelihood. Since ℛ differs from the 
likelihood only by a factor, it can be used directly in Bayes' theorem, which does not depend on constant 
factors, whenever probabilistic considerations are needed. The combination of different independent 
results on the same quantity r can be done straightforwardly by multiplying individual ℛ-functions; note 
that a very noisy and/or low-sensitivity data set results in ℛ = 1 in the region where the good data sets 
yield an ℛ-value varying from 1 to 0, and therefore it does not affect the result. One does not need to decide a 
priori if one wants to make a 'discovery' or an 'upper limit' analysis: the ℛ-function represents the most 
unbiased way of presenting the results and everyone can draw their own conclusions. 

Fig. 3: Relative belief updating ratios ℛ for the Poisson intensity parameter r, in units of events per month, evaluated 
from an expected rate of background events r_b = 1 event/month and the following numbers of observed 
events: 0 (continuous); 1 (dashed); 5 (dotted). 
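
The combination of independent results by multiplying individual ℛ-functions, mentioned above, can be sketched as follows. The two data sets are hypothetical inputs chosen only to illustrate the point about noisy data; `combined_r` is an illustrative name, and Eq. (10) with r_b > 0 is assumed for each data set.

```python
import math


def r_function(r, n_c, r_b, T=1.0):
    """Eq. (10): exp(-r*T) * (1 + r/r_b)**n_c, valid for r_b > 0."""
    return math.exp(-r * T) * (1.0 + r / r_b) ** n_c


def combined_r(r, data_sets):
    """Product of the R-functions of independent data sets, each given as (n_c, r_b, T)."""
    product = 1.0
    for n_c, r_b, T in data_sets:
        product *= r_function(r, n_c, r_b, T)
    return product


# Hypothetical inputs: one clean data set and one dominated by background,
# where the observed count equals the expected background.
clean = (0, 1.0, 1.0)
noisy = (10_000, 10_000.0, 1.0)

for r in (0.1, 1.0, 3.0):
    print(r, r_function(r, *clean), combined_r(r, [clean, noisy]))
```

In the region where the clean data set goes from 1 to 0, the noisy one contributes a factor very close to 1, so the product is essentially that of the clean data set alone.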



Finally, uncertainty due to systematic effects (expected background, efficiency, cross-section, etc.) 
can be taken into account in the likelihood using the laws of probability [10] (see also Ref. [37]). 



5 SOME EXAMPLES OF ℛ-FUNCTION BASED ON REAL DATA 

The case study described till now is based on a toy model simulation. To see how the proposed method 
provides the experimental evidence in a clear way we show in Figs. 4 and 5 ℛ-functions based on real 
data. The first is a reanalysis of Higgs search data at LEP [^]; the second comes from the search for 
contact interactions at HERA made by ZEUS [38]. The extension of Eq. (8) to the most general case is 

    \mathcal{R}(\mu; \mathrm{data}) = \frac{f(\mathrm{data} | \mu)}{f(\mathrm{data} | \mu_{\mathrm{ins}})} ,    (11) 

where μ_ins stands for the asymptotic insensitivity value (0 or ∞, depending on the physics case) of the 
generic quantity μ. Figures 4 and 5 show clearly what is going on, namely which values are practically 
ruled out and which ones are inaccessible to the experiment. The same is true for the result of a neutrino 
oscillation experiment reported as a two-dimensional ℛ-function [39] (see also Ref. H9j). 



6 SENSITIVITY BOUND VERSUS PROBABILISTIC BOUND 

At this point, it is rather evident from Figs. 3, 4 and 5 how we can summarize the result with a single 
number which gives an idea of an upper or lower bound. In fact, although the ℛ-function represents the 
most complete and unbiased way of reporting the result, it might also be convenient to express with just 
one number the result of a search which is considered by the researchers to be unfruitful. This number 
can be any value chosen by convention in the region where ℛ has a transition from 1 to 0. This value 
would then delimit (although roughly) the region of the values of the quantity which are definitively 
excluded from the region in which the experiment can say nothing. The meaning of this bound is not 
that of a probabilistic limit, but of a wall^17 which separates the region in which we are, and where we 
see nothing, from the region we cannot see. We may take as the conventional position of the wall the 
point where ℛ equals 50%, 5% or 1% of the insensitivity plateau. What is important is not to call 

17 In most cases it is not a sharp solid wall. A hedge might be more realistic, and indeed more poetic: "Sempre caro mi fu 
quest'ermo colle, / E questa siepe, che da tanta parte / Dell'ultimo orizzonte il guardo esclude" (Giacomo Leopardi, L'Infinito) 
["Always dear to me was this lonely hill, / and this hedge, which shuts out so much / of the farthest horizon from view"]. 
The exact position of the hedge doesn't really matter, if we think that on the other side of the hedge there are infinite orders of 
magnitude to which we are blind. 

Fig. 4: ℛ-function reporting results on Higgs direct search from the reanalysis of Ref. ||]. A, D and O stand 
for ALEPH, DELPHI and OPAL. Their combined result is indicated by LEP3. The full combination (LEP4) was 
obtained by assuming for L3 a behaviour equal to the average of the other experiments. 

this value a bound at a given probability level (or at a given confidence level - the perception of the result 
by the user will be the same! [15]). A possible unambiguous name, corresponding to what this number 
indeed is, could be 'standard sensitivity bound'. As the conventional level, our suggestion is to choose 
ℛ = 0.05 [0]. 
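
For the Poisson case of Eq. (10), the position of the sensitivity bound can be found numerically. The sketch below is one possible way to do it, under the assumption r_b > 0: it brackets the crossing on the falling side of ℛ (beyond the peak, if there is one) and then bisects. The function names and the tolerance are illustrative choices, not part of the original proposal.

```python
import math


def r_function(r, n_c, r_b, T=1.0):
    """Eq. (10): exp(-r*T) * (1 + r/r_b)**n_c, valid for r_b > 0."""
    return math.exp(-r * T) * (1.0 + r / r_b) ** n_c


def sensitivity_bound(n_c, r_b, T=1.0, level=0.05, tol=1e-10):
    """Rate at which R drops to `level` on its falling side (simple bisection)."""
    # Beyond r = n_c/T - r_b (or beyond 0, if that is negative) R only decreases,
    # and at that point R >= 1 > level, so it is a valid lower bracket.
    lo = max(0.0, n_c / T - r_b)
    hi = lo + 1.0 / T
    while r_function(hi, n_c, r_b, T) > level:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if r_function(mid, n_c, r_b, T) > level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for n_c in (0, 1, 5):
    print(n_c, round(sensitivity_bound(n_c, r_b=1.0), 2))
```

For n_c = 0 the bound is simply ln(20)/T, since ℛ = e^{-rT}. In line with the remark that follows, only the order of magnitude of the printed numbers should be taken seriously.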

Note that it does not make much sense to give the standard sensitivity bound with many significant 
digits. The reason becomes clear by observing Figs. 3, 4 and 5, in particular Fig. 5. I don't think that there 
will be a single physicist who, judging from the figure, believes that there is a substantial difference 
concerning the scale of a postulated contact interaction for ε = +1 and ε = −1. Similarly, looking 
at Fig. 3, the observation of 0 events, instead of 1 or 2, should not produce a significant modification 
of our opinion about g.w. burst rates. What really matters is the order of magnitude of the bound or, 
depending on the problem, the order of magnitude of the difference between the bound and the kinematic 
threshold (see discussion in Sections 9.1.4 and 9.3.5 of Ref. [|l^]). I have the impression that often the 
determination of a limit is considered as important as the determination of the value of a quantity. A 
limit should be considered on the same footing as an uncertainty, not as a true value. We can, at least in 
principle, improve our measurements and increase the accuracy on the true value. This reasoning cannot 
be applied to bounds. Sometimes I have the feeling that when some talk about a '95% confidence limit', 
they think as if they were '95% confident about the limit'. It seems to me that for this reason some are 
disappointed to see upper limits on the Higgs mass fluctuating, in contrast to lower limits which are more 
stable and in constant increase with the increasing available energy. In fact, as said above, these two 95% 
C.L. limits don't have the same meaning. It is quite well understood by experts that lower 95% C.L. 
limits are in practice ≈ 100% probability limits, and they are used in theoretical speculations as certainty 
bounds (see e.g. Ref. [^3|]). 

I can imagine that at this point there are still those who would like to give limits which sound 
probabilistic. I hope that I have convinced them about the crucial role of the prior, and that it is not 
scientific to give a confidence level which is not a 'level of confidence'. In Ref. [10] you will find a 
long discussion about the role and quantitative effect of priors, about the implications of the uniform prior and 
the so-called Jeffreys' prior, and about more realistic priors of experts. There, it has also been shown that 
(somewhat similarly to what was said in the previous section) it is possible to choose a prior which 
provides practically the same probabilistic result acceptable to all those who share a similar scientific 
prejudice. This scientific prejudice is that of the 'positive attitude of physicists' [19], according to which 
rational and responsible people who have planned, financed and run an experiment consider they have 
some reasonable chance to observe something.^18 It is interesting that, no matter how this 'positive 
attitude' is reasonably modelled, the final p.d.f. is, for the case of g.w. bursts (μ_ins = 0), very similar to 
that obtained by a uniform distribution. Therefore, a uniform prior could be used to provide some kind 
of conventional probabilistic upper limits, which could look acceptable to all those who share that kind 
of positive attitude. But, certainly, it is not possible to pretend that these probabilistic conclusions can 
be shared by everyone. Note, however, that this idea cannot be applied in a straightforward way in the case 
μ_ins = ∞, as can be easily understood. In this case one can work on a sensible conjugate variable (see 
next section) which has the asymptotic insensitivity limit at 0, as happens, for example, with ε/Λ² in the 
case of a search for contact interaction, as initially proposed in Refs. [H^,H and still currently done (see 
e.g. Ref. [38]). Ref. [42] also contains the basic idea of using a sensitivity bound, though formulated 
differently in terms of 'resolution power cut-off'. 
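
To make the idea of a conventional probabilistic upper limit concrete for the Poisson case (μ_ins = 0), the sketch below normalizes the ℛ-function of Eq. (10), which with a uniform prior in r is proportional to the final p.d.f., and finds the rate below which 95% of the probability lies. It is a plain trapezoidal integration written for illustration only; the function names, the default integration range and the number of steps are assumptions of this sketch, and r_b > 0 is assumed.

```python
import math


def r_function(r, n_c, r_b, T=1.0):
    """Eq. (10): exp(-r*T) * (1 + r/r_b)**n_c, valid for r_b > 0."""
    return math.exp(-r * T) * (1.0 + r / r_b) ** n_c


def upper_limit_uniform_prior(n_c, r_b, T=1.0, prob=0.95, r_max=None, n_steps=200_000):
    """Rate below which a fraction `prob` of the posterior lies, for a prior uniform in r."""
    if r_max is None:
        # Generous cut-off, far into the region where the posterior is negligible.
        r_max = (n_c + 10.0 * math.sqrt(n_c + 1.0) + 40.0) / T
    h = r_max / n_steps
    values = [r_function(i * h, n_c, r_b, T) for i in range(n_steps + 1)]
    # Cumulative trapezoidal integral of the unnormalized posterior.
    cumulative = [0.0]
    for i in range(1, n_steps + 1):
        cumulative.append(cumulative[-1] + 0.5 * h * (values[i - 1] + values[i]))
    target = prob * cumulative[-1]
    for i in range(1, n_steps + 1):
        if cumulative[i] >= target:
            # Linear interpolation inside the step where the target is crossed.
            frac = (target - cumulative[i - 1]) / (cumulative[i] - cumulative[i - 1])
            return (i - 1 + frac) * h
    return r_max


print(upper_limit_uniform_prior(0, r_b=1.0))
print(upper_limit_uniform_prior(1, r_b=1.0))
```

For n_c = 0 the posterior is an exponential, and the 95% limit is ln(20)/T, which happens to coincide numerically with the ℛ = 0.05 sensitivity bound; the two numbers nevertheless answer different questions, and they differ as soon as n_c > 0.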



18 In some cases researchers are aware of having very little chance of observing anything, but they pursue the research to refine 
instrumentation and analysis tools in view of some positive results in the future. A typical case is gravitational wave search. In 
this case it is not scientifically correct to provide probabilistic upper limits from the current detectors, and the honest way to 
provide the result is that described here H4(J]. However, some could be tempted to use a frequentistic procedure which provides 
an 'objective' upper limit 'guaranteed' to have a 95% coverage. This behaviour is irresponsible since these researchers are 
practically sure that the true value is below the limit. Loredo shows in Section 3.2 of Ref. [[fl} an instructive real-life example 
of a 90% C.I. which certainly does not contain the true value (the web site [ pT| contains several direct comparisons between 
frequentistic and Bayesian results). 

7 OPEN VERSUS CLOSED LIKELIHOOD 

Although the extended discussion on priors has been addressed elsewhere, Figs. 3, 4 and 5 show 
clearly the reason why frontier measurements are crucially dependent on priors: the likelihood only 
vanishes on one side (let us call these measurements 'open likelihood'). In other cases the likelihood 
goes to zero on both sides (closed likelihood). Normal routine measurements belong to the second class, 
and usually they are characterized by a narrow likelihood, meaning high precision. Most particle physics 
measurements belong to the class of closed likelihood. I am quite convinced that the two classes should 
be routinely treated differently. This does not mean recovering the frequentistic 'flip-flop' (see Ref. |Q] 
and references therein), but recognizing the qualitative, not just quantitative, difference between the two 
cases, and treating them differently. 

When the likelihood is closed, the sensitivity to the choice of prior is much reduced, and a 
probabilistic result can be easily given. The best understood subcase is when the likelihood is very narrow. 
Any reasonable prior which models the knowledge of the expert interested in the inference is practically 
constant in the narrow range around the maximum of the likelihood. Therefore, we get the same result 
obtained by a uniform prior. However, when the likelihood is not so narrow, there could still be some 
dependence on the metric used. Again, this problem has no solution if one considers inference as a 
mathematical game [22]. Things are less problematic if one uses physics intuition and experience. The idea 
is to use a uniform prior on the quantity which is 'naturally measured' by the experiment. This might 
look like an arbitrary concept, but is in fact an idea to which experienced physicists are accustomed. 
For example, we say that 'a tracking device measures 1/p', 'radiative corrections measure log(M_H)', 
'a neutrino mass experiment is sensitive to m²', and so on. We can see that our intuitive idea of 'the 
quantity really measured' is related to the quantity which has a linear dependence on the observation(s). 
When this is the case, random (Brownian) effects occurring during the process of measurement tend to 
produce a roughly Gaussian distribution of observations. In other words, we are dealing with a roughly 
Gaussian likelihood. So, a way to state the natural measured quantity is to refer to the quantity for which 
the likelihood is roughly Gaussian. This is the reason why we are used to doing least-squares fits choosing the 
variable in which the χ² is parabolic (i.e. the likelihood is normal) and then interpreting the result as the 
probability of the true value. In conclusion, having to give a suggestion, I would recommend continuing with 
the tradition of considering natural the quantity which gives a roughly normal likelihood. For example, 
this was the original motivation to propose ε/Λ² to report compositeness results [42]. 



This uniform-prior/Gaussian-likelihood duality goes back to Gauss himself [44]. In fact, he 
derived his famous distribution to solve an inferential problem using what we nowadays call the Bayesian 
approach. Indeed, he assumed a uniform prior for the true value (as Laplace did) and searched for the 
analytical form of the likelihood such as to give a posterior p.d.f. with most probable^19 value equal to 
the arithmetic average of the observations. The resulting function was ... the Gaussian. 

When there is no agreement about the natural quantity one can make a sensitivity analysis of 
the result, as in the exercise of Fig. 6, based on Ref. [34]. If one chooses a prior flat in m_H, rather than 
in log(m_H), the p.d.f.'s given by the continuous curves change into the dashed ones. Expected value 
and standard deviation of the distributions (last digits in parentheses) change as follows. For (Δα) = 
0.02804(65), M_H = 0.10(7) TeV becomes M_H = 0.14(9) TeV, while for (Δα) = 0.02770(65) 
M_H = 0.12(6) TeV becomes M_H = 0.15(7) TeV. Although this is just an academic exercise, since 
it is rather well accepted that radiative corrections measure log(M_H), Fig. 6 and the above digits show 
that the result is indeed rather stable: 0.15(9) ≈ 0.10(7) and 0.15(7) ≈ 0.12(6), though perhaps some 
numerologically-oriented colleague would disagree. 
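
The mechanics of such a sensitivity analysis can be illustrated with a toy model. The sketch below is not the analysis of Ref. [34]: it simply assumes a likelihood that is exactly Gaussian in x = ln M, with made-up values of its centre and width, and compares the expected value of M obtained with a prior flat in ln M and with a prior flat in M. The function name `mean_mass` and all numbers are hypothetical.

```python
import math


def mean_mass(mu, sigma, flat_in="logM", n=20001):
    """Expected value of M for a likelihood Gaussian in x = ln M (centre mu, width sigma).

    flat_in="logM": prior uniform in ln M; flat_in="M": prior uniform in M.
    Plain Riemann sum over a wide range in x.
    """
    lo = mu - 10.0 * sigma
    hi = mu + sigma ** 2 + 10.0 * sigma
    h = (hi - lo) / (n - 1)
    num = den = 0.0
    for i in range(n):
        x = lo + i * h
        like = math.exp(-0.5 * ((x - mu) / sigma) ** 2)
        # A prior flat in M has density dM/dx = exp(x) when expressed in x = ln M.
        weight = like * (math.exp(x) if flat_in == "M" else 1.0)
        num += math.exp(x) * weight
        den += weight
    return num / den


# Made-up inputs: likelihood centred at M = 0.1 (arbitrary units), width 0.5 in ln M.
mu, sigma = math.log(0.1), 0.5
print(mean_mass(mu, sigma, "logM"), mean_mass(mu, sigma, "M"))
```

In this toy case the two expected values are exp(mu + sigma²/2) and exp(mu + 3 sigma²/2), so they differ by the factor exp(sigma²): the change of metric matters little when the likelihood is narrow in ln M and more as it widens.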

If a case is really controversial, one can still show the likelihood. But it is important to understand 
that a likelihood is not yet the probabilistic result we physicists want. If only the likelihood is published, 

19 Note that also speaking about the most probable value is close to our intuition, although all values have zero probability. 
See comments in Section 4.1.2 of Ref. m%. 

Fig. 6: Sensitivity analysis exercise from the indirect Higgs mass determination of Ref. [34]. Solid lines and dashed 
lines are obtained with priors uniform in log(m_H) and m_H, respectively. 



the risk is too high that it will be considered anyway and somehow as a probabilistic result, as happens now 
in practice. For this reason, I think that, at least in the rather simple case of closed likelihood, those who 
perform the research should take their responsibility and assess the expected value and standard deviation 
that they really believe, plus other information in the case of a strongly non-Gaussian distribution [|8, 34, 
37]. I do not think that, in most applications, this subjective ingredient is more relevant than the many 
other subjective choices made during the experimental activity and that we have to accept anyhow. In my 
opinion, adhering strictly to the point of view that one should refrain totally from giving probabilistic 
results because of the idealistic principle of avoiding the contribution of personal priors will halt research. 
We always rely on somebody else's priors and consult experts. Only a perfect idiot has no prior, and he 
is not the best person to consult. 

8 OVERALL CONSISTENCY OF DATA 

One of the reasons for confusion with confidence levels is that the symbol 'C.L.' is not only used in 
conjunction with confidence intervals, but is also associated with results of fits, in the sense of statistical 
significance (see e.g. Ref. [4]). As I have commented elsewhere [15, 19], the problems coming from 
the misinterpretation of confidence levels are much more severe than what happens when considering 
confidence intervals as probabilistic intervals. Sentences like "since the fit to the data yields a 1% C.L., 
the theory has a 1% chance of being correct" are rather frequent. Here I would like only to touch on some 
points which I consider important. 

Take the χ², certainly the most used test variable in particle physics. As most people know from 
the theory, and some from having had bad experiences in practice, the χ² is not what statisticians call a 
'sufficient statistic'. This is the reason why, if we see a discrepancy in the data, but the χ² doesn't say 
so, other pieces of magic are tried, like changing the region in which the χ² is applied, or using a 'run 
test', Kolmogorov test, and so on^20 (but, "if I have to draw conclusions from a test with a Russian name, 
it is better I redo the experiment", somebody once said). My recommendation is always to take a look at 
the data, since the eye of the expert is in most simple (i.e. low-dimensional) cases better than automatic 
tests (it is also no mystery that tests are done with the hope they will prove what one sees...). 

I think that χ², like other variables, can be used cum grano salis^21 to spot a possible problem of the 
experiment, or hints of new physics, which one certainly has to investigate. What is important is to be 
careful before drawing conclusions only from the crude result of the test. I also find it important to start 
calling things by their name in our community too and call 'P-value' the number resulting from the test, 



20 Everybody has experienced endless discussions on what I call all-together χ²-ology, to decide if there is some effect. 
21 See Section 8.8 of Ref. mM for a discussion about why frequentistic tests 'often work'. 

as is currently done in modern books of statistics (see e.g. [45]). It is recognized by statisticians that 
P-values also tend to be misunderstood [18, 46], but at least they have a more precise meaning [47] than 
our ubiquitous C.L.'s. 

The next step is what to do when, no matter how, one has strong doubts about some anomaly. Good 
experimentalists know their job well: check everything possible, calibrate the components, make special 
runs and Monte Carlo studies, or even repeat the experiment, if possible. It is also well understood that 
it is not easy to decide when to stop making studies and applying corrections. The risk of influencing a 
result is always present. I don't think there is any general advice that can be given. Good results 
come from well-trained (prior knowledge!) honest physicists (and who are not particularly unlucky...). 

A different problem is what to do when we have to use someone else's results, about which we do 
not have inside knowledge, for example when we make global fits. Also in this case I mistrust automatic 
prescriptions [f|]. In my opinion, when the data points appear somewhat inconsistent with each other 
(no matter how one has formed this opinion) one has to try to model one's scepticism. Also in this 
case, the Bayesian approach offers valid help [W8, 19]. In fact, since one can assign probability to every 
piece of information which is not considered certain, it is possible to build a so-called probabilistic 
network [35], or Bayesian network, to model the problem and find the most likely solution, given 
well-stated assumptions. A first application of this reasoning in particle physics data (though the problem was 
too trivial to build up a probabilistic network representation) is given in Ref. [50], based on an improved 
version of Ref. [49]. 



9 CONCLUSION 

So, what is the problem? In my opinion the root of the problem is the frequentistic intrusion into the 
natural approach initially followed by 'classical' physicists and mathematicians (Laplace, Gauss, etc.) to 
solve inferential problems. As a consequence, we have been taught to make inferences using statistical 
methods which were not conceived for that purpose, as insightfully illustrated by a professional 
statistician at the workshop [|J]. It is a matter of fact that the results of these methods are never intuitive 
(though we force the 'correct' interpretation using our intuition [15]), and fail any time the problem is 
not trivial. The problem of the limits in 'difficult cases' is particularly evident, because these methods 
fail [p2[]. But I would like to recall that also in simpler routine problems, like uncertainty propagation 
and treatment of systematic effects, conventional statistics does not provide consistent methods, but 
only a prescription which we are supposed to obey. 



What is the solution? As well expressed in Ref. [53], sometimes we cannot solve a problem 
because we are not able to make a real change, and we are trapped in a kind of logical maze made by 
many solutions, which are not the solution. Ref. [53] talks explicitly of non-solutions forming a kind of 
group structure. We rotate inside the group, but we cannot solve the problem until we break out of the 
group. I consider the many attempts to solve the problem of the confidence limit inside the frequentistic 
framework as just some of the possible group rotations. Therefore the only possible solution I see is to 
get rid of the frequentistic intrusion in the natural physicist's probabilistic reasoning. This way out, which 
takes us back to the 'classicals', is offered by the statistical theory called Bayesian, a bad name that gives 
the impression of a religious sect to which we have to become converted (but physicists will never be 
Bayesian, as they are not Fermian or Einsteinian [15] - why should they be Neymanian or Fisherian?). I 
consider the name Bayesian to be temporary and just in contrast to 'conventional'. 

I imagine, and have experienced, much resistance to this change due to educational, psychological 
and cultural reasons (not forgetting the sociological ones, usually the hardest ones to remove). For 
example, a good cultural reason is that we consider, in good faith, a statistical theory on the same footing 
as a physical theory. We are used to a well-established physical theory being better than the previous 
one. This is not the case of the so-called classical statistical theory, and this is the reason why an 
increasing number of statisticians and scientists [18] have restarted from the basic ideas of 200 years 
ago, complemented with modern ideas and computing capability flTi|, ^6|, [Tj], f?5|, [HI, |54|]. Also in physics 
things are moving, and there are many now who oscillate between the two approaches, saying that both 
have good and bad features. The reason I am rather radical is that I do not think we, as physicists, 
should care only about numbers, but also about their meaning: 25 is not approximately equal to 26, if 
25 is a mass in kilogrammes and 26 a length in metres. In the Bayesian approach I am confident of what 
numbers mean at every step, and how to go further. 

I also understand that sometimes things are not so obvious or so highly intersubjective, as an anti- 
Bayesian joke says: "there is one obvious possible way to do things, it's just that they can't agree on 
it." I don't consider this a problem. In general, it is just due to our human condition when faced with 
the unknown and to the fact that (fortunately!) we do not have an identical status of information. But 
sometimes the reason is more trivial, that is we have not worked together enough on common problems. 
Anyway, given the choice between a set of prescriptions which gives an exact ('objective') value of 
something which has no meaning, and a framework which gives a rough value of something which has a 
precise meaning, I have no doubt which to choose. 

Coming, finally, to the specific topic of the workshop, things become quite easy, once we have 
understood why an objective inference cannot exist, but an 'objective' (i.e. logical) inferential framework 
does. 

• In the case of open likelihood, priors become crucial. The likelihood (or the ℛ-function) should 
always be reported, and a non-probabilistic sensitivity bound should be given to summarize the 
negative search with just a number. A conventional probabilistic result can be provided using a 
uniform prior in the most natural quantity. Reporting the results with the ℛ-function satisfies the 
desiderata expressed in this paper. 

• In the case of closed likelihood, a uniform prior in the natural quantity provides probabilistic results 
which can be easily shared by the experts of the field. 

As a final remark, I would like to recommend calling things by their name, if this name has a precise 
meaning. In particular: sensitivity bound if it is just a sensitivity bound, without probabilistic meaning; 
and such and such percent probabilistic limit, if it really expresses the confidence of the person(s) who 
assesses it. As a consequence, I would propose not to talk any longer about 'confidence interval' and 
'confidence level', and to abandon the abbreviation 'C.L.'. So, although it might look paradoxical, I 
think that the solution to the problem of confidence limits begins with removing the expression itself. 